an error function, such as

$$\varepsilon = \frac{1}{N}\sum_{n=1}^{N}\left(\hat{y}_n - y_n\right)^2 \qquad (3.33)$$
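As a minimal sketch of Eq. (3.33), the function below computes the mean squared error of a set of predictions $\hat{y}_n$ against targets $y_n$ (the function name and NumPy usage are illustrative, not from the source):

```python
import numpy as np

def error(y_hat: np.ndarray, y: np.ndarray) -> float:
    """Error function of Eq. (3.33): the mean of squared residuals over N samples."""
    return float(np.mean((y_hat - y) ** 2))

# e.g. error(np.array([1.1, 0.9]), np.array([1.0, 1.0])) -> 0.01
```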

Denoting all model parameters using a vector notation $\mathbf{w}$ and a data vector $\mathbf{x}$, the estimation of $\mathbf{w}$ is an optimisation process

$$\hat{\mathbf{w}} = \arg\min_{\mathbf{w}} \left\{ \frac{1}{N}\sum_{n=1}^{N}\left(f(\mathbf{x}_n, \mathbf{w}) - y_n\right)^2 \right\} \qquad (3.34)$$
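A sketch of the objective in Eq. (3.34) for a generic nonlinear model $f$ is shown below; the single-tanh-unit model, the data, and the use of scipy.optimize are illustrative assumptions, not the book's method:

```python
import numpy as np
from scipy.optimize import minimize

def objective(w: np.ndarray, f, X: np.ndarray, y: np.ndarray) -> float:
    """The quantity minimised in Eq. (3.34): mean squared error of f over the data."""
    predictions = np.array([f(x_n, w) for x_n in X])
    return float(np.mean((predictions - y) ** 2))

# Hypothetical nonlinear model: a single tanh unit, f(x, w) = tanh(w . x).
f = lambda x, w: np.tanh(w @ x)
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.5, -0.2, 0.3])
w_hat = minimize(objective, x0=np.zeros(2), args=(f, X, y)).x
```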

Because this $f(\mathbf{x}, \mathbf{w})$ is a nonlinear function, there will not be an analytical format of the estimated model parameters such as the one used in linear models. Therefore, an algorithm called the backward propagation algorithm has been developed to estimate parameters for an MLP model [Rumelhart, et al., 1986]. The backward propagation algorithm is a variant of Newton's method [Fletcher, 1987]. It is based on the derivative of an error function, which is defined in the equation shown below, where $0 < \eta < 1$ is called the learning rate,

$$\Delta\mathbf{w} = -\eta\,\frac{\partial\varepsilon}{\partial\mathbf{w}} \qquad (3.35)$$

Use of the above equation means that the change (update) of every model parameter is negatively proportional to the derivative of the error function, $\mathbf{w}_{t+1} = \mathbf{w}_t - \eta\,\frac{\partial\varepsilon}{\partial\mathbf{w}}$.
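The iteration can be sketched as follows. Backward propagation obtains $\partial\varepsilon/\partial\mathbf{w}$ analytically via the chain rule through the network layers; to keep the sketch self-contained, a finite-difference gradient stands in for that computation, and all names are illustrative:

```python
import numpy as np

def numerical_gradient(eps_fn, w: np.ndarray, h: float = 1e-6) -> np.ndarray:
    """Finite-difference estimate of d(eps)/dw; backward propagation
    computes the same gradient analytically via the chain rule."""
    grad = np.zeros_like(w)
    for i in range(w.size):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus[i] += h
        w_minus[i] -= h
        grad[i] = (eps_fn(w_plus) - eps_fn(w_minus)) / (2.0 * h)
    return grad

def train(eps_fn, w0: np.ndarray, eta: float = 0.1, steps: int = 100) -> np.ndarray:
    """Repeated application of Eq. (3.35): w_{t+1} = w_t - eta * d(eps)/dw."""
    w = w0.copy()
    for _ in range(steps):
        w = w - eta * numerical_gradient(eps_fn, w)
    return w
```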

A large derivative points to a steep or sharp segment of the error function curve. Therefore, a cautious update of the model parameters is required. This cautious move can avoid missing the optimal point (jumping over the optimal point) of the error function curve. A small derivative corresponds to a flatter segment of the error function curve. Therefore, a greedy or greater update of the model parameters can be made. However, this simple update rule may cause oscillation, i.e. the over-update may jump over the saddle point of the error function curve forward and backward.
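This trade-off can be seen on a hypothetical one-parameter error curve $\varepsilon(w) = w^2$, whose derivative is $2w$: a cautious learning rate walks steadily towards the optimum, while a rate close to 1 makes the parameter jump over the optimum forward and backward (a minimal sketch, not tied to any particular model):

```python
def updates(eta: float, w: float = 1.0, steps: int = 6) -> list:
    """Apply Eq. (3.35) to eps(w) = w**2, where d(eps)/dw = 2*w."""
    path = [w]
    for _ in range(steps):
        w = w - eta * 2.0 * w
        path.append(round(w, 4))
    return path

print(updates(eta=0.1))  # smooth decay towards the optimum at w = 0
print(updates(eta=0.9))  # sign flips each step: oscillation around the optimum
```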